@venkamita
Contributor

Adds a baseline machine learning project to classify bacterial genes as essential or non-essential using DNA sequence data from the macwiatrak/bacbench-essential-genes-dna dataset.

Files Added:

- main.py – Implements the Logistic Regression pipeline with 4-mer feature extraction.
- requirements.txt – Lists all Python dependencies needed to run the project.
- README.md – Project overview, dataset description, preprocessing steps, model evaluation, and usage instructions.

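
For readers unfamiliar with k-mer features, here is a minimal sketch of how a DNA sequence could be turned into a fixed-length 4-mer count vector. It is not taken from main.py: the function name `kmer_counts`, the `step` parameter, and the choice to ignore windows containing characters outside ACGT are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def kmer_counts(seq, k=4, step=4):
    """Return a list of k-mer counts for one DNA sequence.

    The vector has 4**k entries, ordered lexicographically over "ACGT".
    step=k reads non-overlapping windows; step=1 reads overlapping ones.
    Windows containing other characters (e.g. "N") are ignored because
    they never match an entry in the vocabulary.
    """
    # Hypothetical fixed vocabulary: every possible k-mer over ACGT.
    vocab = ["".join(p) for p in product("ACGT", repeat=k)]
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(0, len(seq) - k + 1, step))
    return [counts.get(kmer, 0) for kmer in vocab]


non_overlapping = kmer_counts("ACGTACGT")          # windows: ACGT, ACGT
overlapping = kmer_counts("ACGTACGT", step=1)      # 5 sliding windows
```

Each sequence maps to a 256-dimensional vector for k=4, so rows from many genes can be stacked into a feature matrix for a linear classifier such as Logistic Regression.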
Notes:

- Serves as a simple baseline for essential gene prediction.
- First ML project attempt; AI was used only for debugging assistance.
- Follow-up improvements could include handling class imbalance, overlapping k-mers, and more advanced models.
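
One common way to approach the class imbalance listed as a follow-up is to weight each class inversely to its frequency. The sketch below computes weights as `n_samples / (n_classes * class_count)`, the heuristic scikit-learn documents for `class_weight="balanced"`. The function name and the toy labels are illustrative and do not come from this PR.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Map each class label to n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * n) for cls, n in counts.items()}


# Toy labels (assumed): 8 non-essential genes (0) and 2 essential genes (1).
weights = balanced_class_weights([0] * 8 + [1] * 2)
```

With scikit-learn, a dictionary like this can be passed as the `class_weight` argument of `LogisticRegression`, or the string `"balanced"` can be passed to have the library compute the same weights from the training labels.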

github-actions bot commented Dec 7, 2025

👋 @venkamita
Thank you for raising your pull request.
Please make sure you have followed our contributing guidelines. We will review it as soon as possible.

Contributor

@iamwatchdogs left a comment

Reviewed and Approved by @iamwatchdogs.

@iamwatchdogs changed the title from "Add baseline essential-gene classifier with main.py, requirements, and README.md" to "Add Essential Gene Classification from DNA Sequences" on Dec 26, 2025
@iamwatchdogs merged commit ea5f8e4 into Grow-with-Open-Source:main on Dec 26, 2025
4 checks passed
@github-actions

Thank you for contributing @venkamita. Make sure to check your contribution on GitHub Pages.

2 participants